A systematic investigation of detectors for low signal-to-noise ratio EMG signals

Background: Active participation of stroke survivors during robot-assisted movement therapy is essential for sensorimotor recovery. Robot-assisted therapy contingent on movement intention is an effective way to encourage patients’ active engagement. For severely impaired stroke patients with no residual movements, a surface electromyogram (EMG) has been shown to be a viable option for detecting movement intention. Although numerous algorithms for EMG detection exist, the detector with the highest accuracy and lowest latency for low signal-to-noise ratio (SNR) remains unknown.
Methods: This study, therefore, investigates the performance of 13 existing EMG detection algorithms on simulated low SNR (0 dB and -3 dB) EMG signals generated using three different EMG signal models: Gaussian, Laplacian, and biophysical. The detector performance was quantified using the false positive rate (FPR), false negative rate (FNR), and detection latency. Any detector that consistently showed FPR and FNR of no more than 20%, and latency of no more than 50 ms, was considered an appropriate detector for use in robot-assisted therapy.
Results: The results indicate that the Modified Hodges detector, a simplified version of the threshold-based Hodges detector introduced in the current study, was the most consistent detector across the different signal models and SNRs. It met the acceptance criteria in ~90% and ~40% of the tested trials for 0 dB and -3 dB SNR, respectively. The two statistical detectors (Gaussian and Laplacian Approximate Generalized Likelihood Ratio) and the Fuzzy Entropy detector performed slightly worse than Modified Hodges.
Conclusions: Overall, the Modified Hodges, Gaussian and Laplacian Approximate Generalized Likelihood Ratio, and Fuzzy Entropy detectors were identified as potential candidates that warrant further investigation with real surface EMG data, since they had consistent detection performance on low SNR EMG data.


Introduction
Active patient participation during movement therapy is an essential ingredient for sensorimotor recovery after a stroke. About 30% of stroke survivors are severely impaired [1,2] and require physical assistance to actively engage in movement training. Robotic assistance can motivate such subjects to attempt and train movements, and when it is contingent on the intention to move, it may be an effective way to guide neuroplasticity [3]. However, it can be difficult to detect the intention to move in stroke patients with no visible residual movement. Electroencephalogram (EEG) based brain-computer interfaces (BCI) have been used to detect movement intention to trigger robotic movement assistance. There are, however, several drawbacks to EEG-BCI systems. They often exhibit a poor signal-to-noise ratio, with significant trial-to-trial intra-subject variability [4]. EEG-BCI modalities lack task specificity [5], and their complexity and time-consuming nature make them less suitable for routine clinical use [6].
Surface electromyogram (sEMG) could be a viable alternative that addresses these drawbacks of the EEG-BCI modality for robot-assisted therapy. sEMG is a simple, robust, compact modality suitable for routine clinical use. In a recent study, we identified sEMG as a potential alternative to EEG-BCI for detecting movement intention in severely affected stroke patients without visible residual movement [6]. The lack of visible movement in severely affected stroke patients can be due to co-contraction, increased joint stiffness, etc. Residual EMG could still be present in this patient group if weak neural commands from the brain reach the target muscles and elicit muscle activation that is not sufficient to cause visible movements. About 70% (22 out of 30) of the study participants had residual sEMG in the forearm muscles that showed a consistent increase in amplitude with wrist/finger movement attempts.
However, our study reported poor agreement between the EEG and sEMG modalities for detecting movement intention.
The authors suggested that this discrepancy could be because of the simple root mean square detector with temporal thresholding that was used. This detector may not optimally pick up the low SNR sEMG signals [6] expected from severely affected patients with no visible residual movements.
][9][10][11][12][13][14] The review article by Staude et al. [7] compared different sEMG detector types to identify the best detectors for detecting sEMG onset. However, this work was done on high SNR (3 dB and 6 dB) simulated sEMG signals generated by bandpass-filtering white Gaussian noise, and it only investigated detectors reported up to 2001. Other detectors have been reported in the last 20 years, and a systematic characterization of existing sEMG detector types on low SNR sEMG signals (generated using different signal models) is lacking. Identifying an optimal detector is essential for further exploring the use of sEMG-based movement intent detection for robot-assisted therapy in severely affected stroke subjects with no visible residual movements.
Our goal in this study is to systematically compare the detection accuracy and latency of existing sEMG detectors on low SNR sEMG signals to identify the most promising detectors, while eliminating the ones with poor performance, for further investigation. To this end, we will:
• Generate simulated low SNR (0 dB and -3 dB) sEMG data using two phenomenological (Gaussian and Laplacian) models and a biophysical model to evaluate the performance of the different detector types.
• Define an appropriate cost function considering the detection accuracy and latency to evaluate the performance of the different detector types.
• Compare the performance of the different detector types on simulated sEMG signals from the three signal models for two different SNRs (0 dB and -3 dB) and identify the most appropriate detector type(s).

REVISED Amendments from Version 3
We have made corrections based on the suggestions from the reviewer comments on our paper. The terminology "no residual movement" is now changed to "no visible residual movement" as suggested. Formal corrections to the references as suggested were also made.

Methods
Neurorehabilitation training consists of repeating specific movements of interest punctuated by periods of rest [15,16]. A typical training session will involve several "trials" of a particular movement, each with a period of rest (rest-phase) followed by a period of movement/movement attempt (move-phase). In sEMG-driven robot-assisted therapy, the robot remains inactive during the rest-phase, while in the move-phase robot-assisted movement of the subject's limb is contingent upon the presence or absence of sEMG at any given time instant; continued sEMG is required to continuously receive robotic assistance. The rest of this section starts with the formal definition of the signal processing problem solved by an sEMG detector, followed by the details of the simulated sEMG signals, a description of the general structure of sEMG detectors, and the approach used to compare the performance of the different detectors.
The formal definition of the signal processing problem
Let $x_i[n]$, $0 \le n < N_t$, be the recorded signal from a target muscle during the $i$-th trial; $n$ is the sampling instant, where $n = 0$ is the start of a trial, and $N_t$ is the number of data points in each trial. Let $N_r$ and $N_m$ be the number of samples in the rest- and move-phases of a trial, respectively; then $N_t = N_r + N_m$. The time segments $0 \le n < N_r$ and $N_r \le n < N_t$ correspond to the rest-phase and move-phase of the trial, respectively.
Problem definition: To detect the presence of EMG in real-time in the move-phase of a trial using only the current and past EMG data $\{x[k]\}$, $0 \le k \le n$, from the start of the trial. Let $y[n]$ represent the binary output of the sEMG detector at the current sampling instant $n$,
$$y[n] = D\left(\{x[k]\}_{k=0}^{n};\, p\right) \in \{0, 1\}$$
where $D(\cdot)$ is the detector function that maps the sEMG signal $\{x[k]\}_{k=0}^{n}$ from the start of a trial to a binary output corresponding to the presence or absence of sEMG at the current time instant $n$, and $p$ is the set of detector parameters that control the detector's performance. The function $D(\cdot)$ is often a complex mathematical operation consisting of a series of simpler operations performed on the sEMG data to produce the binary output. This binary output can be used as a simple on/off control of robotic assistance by severely affected patients to relearn movement initiation [17,18].

Simulation of surface EMG signal
The analysis of the different sEMG detectors was performed using simulated sEMG data. To generate this simulated sEMG, we assume that:
• the measurement noise has a fixed variance throughout the experiment,
• the muscle is fully relaxed in the rest-phase of any trial, i.e., there is no sEMG activity from the target muscle during the rest-phase, and
• the muscle is activated at a constant level for the entire duration of the move-phase.
These assumptions were made to evaluate the sEMG detectors under the conditions that: (a) the sEMG signal has a fixed signal-to-noise ratio (SNR) in the move-phase, and (b) all other intra- and inter-trial variabilities in the sEMG signal characteristics are minimized. The detectors that perform poorly under these ideal conditions will likely perform worse with real sEMG data from patients, since real sEMG data from severely affected stroke patients might occur in random bursts and is likely to have time-varying amplitude.
We simulated 100 trials of sEMG data with an individual trial duration of 13 seconds (8 s and 5 s for the rest- and move-phases, respectively). The sampling frequency of the simulated signal was set to 1000 Hz. Three different sEMG signal generation models were employed in the current analysis: two phenomenological (Gaussian and Laplacian) models and one biophysical model.

Phenomenological models: Gaussian and Laplacian
1][22][23] The exact probability density of the sEMG signal depends on the muscle activation level, with high levels of activation following a Gaussian distribution [20,21]. However, at low levels of muscle activation, sEMG signals have been reported to follow a distribution that lies between a Gaussian and a Laplacian distribution [24]. Therefore, to ensure that the detectors are tested with the appropriate signals, we generated data using both white Gaussian and white Laplacian signals, resulting in two phenomenological models.
The first step in this model is the generation of zero-mean, unit-variance white Gaussian or Laplacian noise $e[n]$. A step change in the signal variance at the transition between the rest- and move-phases of a trial was obtained by multiplying $e[n]$ by $\sigma[n]$:
$$\hat{e}[n] = \sigma[n]\, e[n], \qquad \sigma[n] = \begin{cases} \sigma_0, & 0 \le n < N_r \\ \sqrt{\sigma_0^2 + \sigma_1^2}, & N_r \le n < N_t \end{cases}$$
where $\sigma_0^2$ and $\sigma_1^2$ are the noise and signal variances, respectively. The noise variance is always set to $\sigma_0^2 = 1$ in this analysis, and the signal variance is chosen based on the desired signal-to-noise ratio (SNR) in the move-phase. The signal $\hat{e}[n]$ is then zero-phase bandpass filtered (8th order FIR bandpass filter with cut-off frequencies of 10 Hz and 450 Hz) to obtain a signal with spectral characteristics like those of an sEMG signal:
$$x[n] = h_{sf}[n] * \hat{e}[n]$$
where $h_{sf}[n]$ is the impulse response of the bandpass or shaping filter, and $x[n]$ is the generated sEMG signal that is used for the analysis. The signal-to-noise ratio (SNR) of this simulated sEMG signal in the move-phase is given by
$$\mathrm{SNR} = 10 \log_{10}\left(\frac{\sigma_1^2}{\sigma_0^2}\right)\ \text{dB}$$
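The variance-step construction described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function names are hypothetical, the bandpass shaping filter is deliberately omitted, and it assumes that signal and noise are independent so that the move-phase variance is the sum of the noise and signal variances.

```python
import math
import random


def laplace_unit_variance(rng):
    """Draw one zero-mean, unit-variance Laplacian sample by inverse-CDF sampling."""
    scale = 1.0 / math.sqrt(2.0)  # Laplace variance is 2 * scale**2
    u = rng.random() - 0.5        # uniform on [-0.5, 0.5)
    # Guard the log argument against the single edge case u == -0.5.
    magnitude = -scale * math.log(max(1.0 - 2.0 * abs(u), 1e-300))
    return math.copysign(magnitude, u)


def simulate_trial(snr_db, model="gaussian", fs=1000, rest_s=8, move_s=5, seed=0):
    """Return one unfiltered trial and the index at which the move-phase starts."""
    rng = random.Random(seed)
    sigma0_sq = 1.0                                  # noise variance, fixed to 1 as in the text
    sigma1_sq = sigma0_sq * 10.0 ** (snr_db / 10.0)  # signal variance from the SNR definition
    n_rest, n_move = rest_s * fs, move_s * fs
    trial = []
    for n in range(n_rest + n_move):
        e = rng.gauss(0.0, 1.0) if model == "gaussian" else laplace_unit_variance(rng)
        # Assumption: independent signal and noise, so their variances add in the move-phase.
        var = sigma0_sq if n < n_rest else sigma0_sq + sigma1_sq
        trial.append(math.sqrt(var) * e)
    return trial, n_rest


def variance(samples):
    """Population variance of a list of samples."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)
```

Under these assumptions, the ratio of move-phase to rest-phase variance should be close to 2 at 0 dB and close to 1.5 at -3 dB, which gives a quick sanity check on a generated trial before any filtering is applied.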

Biophysical model
In addition to the phenomenological models, we also wanted to test the detectors on more realistic data based on the biophysics of the sEMG signal, accounting for the physiological origin of the electrical muscle activity and the recording electrode geometry. In this paper, the biophysical model proposed in Ref. 25 was employed to generate the simulated sEMG data. Assuming a linear, isotropic volume conduction model and a simple muscle geometry with parallel muscle fibres, and ignoring the effects due to the finite muscle fibre length, the sEMG recorded by a bipolar electrode configuration can be approximated using the following expression:
$$x(t) = \sum_{q=1}^{Q} M_q \, R_q(t) * e_q(t) * p(t)$$
where $Q$ is the number of motor units in the muscle, $R_q(t)$ is the impulse train signal arriving at the $q$-th motor unit through its corresponding motor neuron, $M_q$ is the number of muscle fibres in the $q$-th motor unit, $e_q(t)$ is the approximate electrode transfer function between the $q$-th motor unit and the recording electrodes, and $p(t)$ is the single fibre action potential, which is assumed to be the same for all fibres. The full details of the model can be found in Devasahayam [25], with the associated parameters provided in Table S3 of the supplementary material.
The EMG simulator developed by Devasahayam [23] was employed in the current work to generate the simulated sEMG signals [25]. A bipolar surface electrode configuration with a 10 mm interelectrode distance was considered. The simulator takes the muscle force level as its input and computes the corresponding firing pattern for the motor units. In the current study, the force level of the muscle was set to 0 N in the rest-phase (no muscle activation) and 10 N in the move-phase (average firing rate of 16.4 Hz for the muscle). The simulator generated the pure muscle activity $x_{pure}[n]$ recorded by the chosen electrode configuration. The force level for the muscle in the move-phase was chosen empirically to ensure that the temporal profile of the simulated sEMG signal ($x_{pure}[n]$, $0 \le n < N_t$) visually resembled that of real sEMG signals. Zero-mean white Gaussian noise $e[n]$ of fixed variance $\sigma_0^2$ was added to $x_{pure}[n]$ to introduce measurement noise. The noise variance was chosen based on the signal power ($\sigma_1^2$) of $x_{pure}[n]$ in the move-phase to obtain a signal with the desired SNR (in dB):
$$\sigma_0^2 = \frac{\sigma_1^2}{10^{\mathrm{SNR}/10}}$$
Following this, the noisy signal $\left(x_{pure}[n] + e[n]\right)$ is bandpass filtered (8th order non-causal FIR filter) between the 10 Hz and 450 Hz cut-off frequencies:
$$x[n] = h_{sf}[n] * \left(x_{pure}[n] + e[n]\right)$$
where $h_{sf}[n]$ is the impulse response of the bandpass shaping filter. The characteristics of the sEMG generated from these three models are shown in Figure 1.
We wanted to investigate the performance of the detectors under two conditions where the signal power was: (a) equal to the noise power, and (b) less than the noise power. Thus, the current study employed two different SNRs of 0 dB (signal power equals noise power) and -3 dB (signal power is half the noise power).
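As a small worked illustration of how a noise variance could be chosen for a target SNR, the Python sketch below inverts the usual decibel definition of SNR. The function names are hypothetical and the signal power value is a placeholder rather than a value from the study.

```python
import math


def noise_variance_for_snr(signal_power, target_snr_db):
    """Noise variance such that 10*log10(signal_power / noise_variance) equals target_snr_db."""
    return signal_power / (10.0 ** (target_snr_db / 10.0))


def snr_in_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(signal_power / noise_power)


# Placeholder signal power of 1.0 (arbitrary units) for the move-phase.
noise_0db = noise_variance_for_snr(1.0, 0.0)    # equal signal and noise power
noise_m3db = noise_variance_for_snr(1.0, -3.0)  # noise power is roughly twice the signal power
```

Note that -3 dB corresponds to a power ratio of 10^(-0.3), which is approximately 0.501, so describing the signal power as half the noise power is a close approximation rather than an exact identity.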

Detection algorithms
The general structure for sEMG detectors proposed by Staude et al. [7] is shown in Figure 2. It consists of three steps carried out sequentially to map the given real-time sEMG data into a binary output:
1. Signal conditioning is the first step to improve sEMG signal quality for better detection, often involving high-pass filtering for movement artefact removal. Some detectors might employ additional filtering operations, such as adaptive whitening for stable sEMG amplitude estimation [26,27]. The conditioned signal is represented by
$$\tilde{x}[n] = S\left(x[n]\right)$$
where $S(\cdot)$ represents the mathematical operation performed by the signal conditioning step.
2. Test function computation transforms $\tilde{x}[n]$ into a scalar variable or feature that can distinguish the presence or absence of sEMG. The test function $g[n]$ is computed at the current time instant $n$ over a causal window of size $W$:
$$g[n] = T\left(\tilde{x}[n-W+1], \ldots, \tilde{x}[n]\right)$$
where $T(\cdot)$ denotes the test function operation. Some examples of test functions in the literature include the moving average of $\tilde{x}[n]$, the $\chi^2$ test variable [28], the likelihood ratio [29], etc.
3. A decision rule is applied to the test function $g[n]$ by comparing it to a threshold $h$ to identify the presence/absence of an sEMG signal:
$$y[n] = \begin{cases} 1, & g[n] > h \\ 0, & \text{otherwise} \end{cases}$$
The threshold $h$ is adaptive and is calculated for each trial by adding $\alpha$ times the standard deviation of the first 3 seconds of data (Figure 3) to the mean. $\alpha$ is termed the weight for threshold in this paper. Some detectors employ a more sophisticated decision rule, such as double thresholding, to control the false positive rates of detection [28,30].
We note that each detector has a set of parameters associated with it. The current study compares the performance of 13 detector types reported in the literature which can be implemented in real-time, listed in Table 1. Each detector type fits into the general structure shown in Figure 2. The different parameters associated with these detector types are also provided in Table 2. A detailed description of the individual detector types and the algorithms for their implementation are provided in the extended data (Table S1). All detector algorithms were obtained from the literature and implemented in MATLAB [33] with appropriate modifications required for real-time detection.

Table 1. Description of the structure of the 13 detectors investigated in the current study, along with the different parameters associated with the individual detectors. (Columns include: Test function; Parameters.)
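To make the three-step structure concrete, the following Python sketch implements a simple threshold detector in the spirit of the Modified Hodges description in this paper (rectification, a causal 2nd order Butterworth low-pass filter as the test function, and a mean-plus-weighted-standard-deviation threshold from the first 3 seconds). It is an illustration under stated assumptions, not the authors' MATLAB code: the 10 Hz cut-off, the threshold weight of 3, and all function names are hypothetical choices.

```python
import math
import random


def butter2_lowpass(cutoff_hz, fs):
    """Second-order Butterworth low-pass biquad (bilinear transform, Q = 1/sqrt(2))."""
    w0 = 2.0 * math.pi * cutoff_hz / fs
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2 * Q) with Q = 1/sqrt(2)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / (2.0 * a0), (1.0 - cos_w0) / a0, (1.0 - cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def causal_filter(b, a, x):
    """Direct-form I difference equation; uses only current and past samples."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y


def threshold_detector(x, fs=1000, cutoff_hz=10.0, weight=3.0, baseline_s=3.0):
    """Return the binary output y, the test function g, and the threshold h."""
    b, a = butter2_lowpass(cutoff_hz, fs)
    g = causal_filter(b, a, [abs(v) for v in x])  # rectify, then low-pass filter
    n_base = int(baseline_s * fs)
    mu = sum(g[:n_base]) / n_base
    sd = math.sqrt(sum((v - mu) ** 2 for v in g[:n_base]) / n_base)
    h = mu + weight * sd                          # threshold: mean + weight * std
    # No decision is issued while the threshold is still being estimated.
    y = [0] * n_base + [1 if v > h else 0 for v in g[n_base:]]
    return y, g, h


# Usage on a synthetic trial: 8 s of unit-variance noise, then 5 s at four times the amplitude.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(8000)] + [rng.gauss(0.0, 4.0) for _ in range(5000)]
y, g, h = threshold_detector(x)
```

The synthetic trial uses a much higher SNR than the 0 dB and -3 dB conditions studied here, purely so that the behaviour of the three stages is easy to see; at the low SNRs of interest, the choice of cut-off frequency and threshold weight matters far more.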

A measure of detector performance
The simulated sEMG data from the three different signal models and the two different SNRs were used to evaluate the performance of the different detector types. Each trial of the sEMG signal (13 seconds long) was input to the different detectors to compute a binary output indicating the presence or absence of sEMG. An optimal EMG detector designed for use in EMG-driven robot-assisted therapy should quickly identify the onset of sEMG, efficiently eliminate false positives, and consistently detect sEMG when it is present (low false negative rate). Such a detector might be essential for maintaining the user's motivation and sense of agency. These performance measures are computed from the output $y[n]$ of each trial (Figure 3), where the sEMG signal from each trial was analysed in the following three steps:
1. The first three seconds (0-3 s) of the rest-phase data are used for estimating the threshold $h$ for detection:
$$h = \mu_g + \alpha\, \sigma_g$$
where $\mu_g$ and $\sigma_g$ are the mean and standard deviation of the test function in this period.
2. The remaining 5 seconds of the rest-phase are used to estimate the false positive rate ($r_{FP}$).
3. The 5 seconds of the move-phase are used to estimate the false negative rate ($r_{FN}$) and the detection latency ($\Delta t$).
We defined a performance measure, referred to as the cost of detection, that combines the false positive rate $r_{FP}$, the false negative rate $r_{FN}$, and the detection latency $\Delta t$ into a single number. Let $c = \left[r_{FP},\ r_{FN},\ f(\Delta t)\right]^T$ be the cost vector associated with the output $y[n]$ of the detector for a particular trial. We define the cost of detection $C$ as the infinity norm of the cost vector $c$,
$$C = \|c\|_\infty = \max\left\{r_{FP},\ r_{FN},\ f(\Delta t)\right\}$$
providing the worst-case performance of the detector on the given trial.
The false positive rate $r_{FP}$ is defined as the fraction of samples in the evaluated rest-phase segment for which the detector output $y[n]$ is 1, and the false negative rate $r_{FN}$ is defined as the fraction of samples in the move-phase for which $y[n]$ is 0.
From Figure 3, the detection latency is defined as the time delay from the start of the move-phase to when the detector output first goes to 1:
$$\Delta t = T_s \left( \min\{\, n \ge N_r : y[n] = 1 \,\} - N_r \right)$$
where $T_s$ is the sampling period of the data in milliseconds, and $\Delta t \in [0, 5000]$ ms. The cost due to this latency is quantified by the function $f(\Delta t)$, which maps $\Delta t$ to a real number in the closed interval between 0 and 1:
$$f(\Delta t) = \begin{cases} \Delta t / 250, & 0 \le \Delta t \le 250\ \text{ms} \\ 1, & \Delta t > 250\ \text{ms} \end{cases}$$
Latencies between 0 and 250 ms have linearly increasing costs, while latencies above 250 ms are considered as bad as 250 ms. Based on the definitions of $r_{FP}$, $r_{FN}$, and $f(\Delta t)$, $C \in [0, 1]$. A detector with a consistently lower cost of detection $C$ would be considered a better detector.
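The cost of detection for a single trial can be sketched in Python as follows. The function names are hypothetical, the trial layout (3 s threshold estimation, 5 s rest evaluation, 5 s move-phase at 1000 Hz) follows the description in this section, and the handling of a trial with no detection at all (latency set to the full 5000 ms) is an assumption based on the stated range of the latency.

```python
def latency_cost(dt_ms, saturation_ms=250.0):
    """Linear cost up to the saturation latency, then capped at 1."""
    return min(dt_ms / saturation_ms, 1.0)


def detection_cost(y, fs=1000, baseline_s=3, rest_s=8, move_s=5):
    """Return the cost C and its components (r_fp, r_fn, dt_ms, f) for one trial."""
    n_base, n_rest = baseline_s * fs, rest_s * fs
    rest = y[n_base:n_rest]                # rest-phase samples after threshold estimation
    move = y[n_rest:n_rest + move_s * fs]  # move-phase samples
    r_fp = sum(rest) / len(rest)           # fraction of 1s at rest
    r_fn = 1.0 - sum(move) / len(move)     # fraction of 0s while moving
    ts_ms = 1000.0 / fs
    onset = next((i for i, v in enumerate(move) if v == 1), None)
    # Assumption: if the detector never fires, use the full move-phase duration.
    dt_ms = move_s * 1000.0 if onset is None else onset * ts_ms
    f = latency_cost(dt_ms)
    return max(r_fp, r_fn, f), (r_fp, r_fn, dt_ms, f)


# Example: a clean detector output that switches on 30 ms after movement onset.
y_example = [0] * 8000 + [0] * 30 + [1] * 4970
cost, parts = detection_cost(y_example)
```

In this example the false positive rate is 0, the false negative rate is 0.006, and the latency cost is 30/250 = 0.12, so the latency term determines the overall cost. Taking the maximum rather than a weighted sum means a detector cannot compensate for a poor score in one component with good scores in the others.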
Comparing different detector types
A detector's performance or cost is determined by the SNR of the input signal, the detector type, and its associated parameters. Thus, for a fixed SNR input signal, comparing two detector types must be done only after controlling for the influence of their corresponding detector parameters. In the current work, this was done by first choosing the optimal parameters for each detector type, before comparing different detector types. The optimal parameters for a detector type were selected by first splitting the 100 movement trials of simulated sEMG data of the three models (which were generated as explained above) into training and validation datasets with 50 trials each. This was done for both SNRs (0 dB and -3 dB) and for all three signal models (Gaussian, Laplacian, and biophysical). The training dataset was used to identify the optimal parameter values for the different detector types, i.e. the values of the parameter combination that consistently resulted in the least cost for the detector on the training dataset. The exact procedure is given in Algorithm 1 (end of the document), while details are provided in the extended data.
After identifying the optimal parameter combination for each detector type, the optimal parameter values were used to run the detector on the 50 trials of the validation dataset, which resulted in the validation cost set for the detector type D. The cost sets from the different detector types were compared using a two-way ANOVA with the detector type and signal SNR as the two factors, for each of the three signal models. The complete code for the analysis can be found here.
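The train/validation procedure described above can be outlined with a short Python sketch. This is not Algorithm 1, which is not reproduced in this section: the exhaustive grid search, the use of the median training cost as the selection criterion, the parameter names, and the `trial_cost` callable are all assumptions made for illustration.

```python
import itertools
import statistics


def select_parameters(train_trials, param_grid, trial_cost):
    """Pick the parameter combination with the lowest median cost on the training trials.

    trial_cost(trial, params) is a hypothetical callable that runs a detector
    on one trial and returns its cost of detection in [0, 1].
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = statistics.median(trial_cost(t, params) for t in train_trials)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def validation_costs(valid_trials, params, trial_cost):
    """Cost set of the tuned detector on the held-out validation trials."""
    return [trial_cost(t, params) for t in valid_trials]


# Toy usage: 100 placeholder trials split 50/50, with a made-up cost surface.
trials = list(range(100))
train, valid = trials[:50], trials[50:]
grid = {"weight": [1, 2, 3, 4], "window": [50, 100, 200]}
toy_cost = lambda trial, p: abs(p["weight"] - 3) / 10 + abs(p["window"] - 100) / 1000
best, score = select_parameters(train, grid, toy_cost)
costs = validation_costs(valid, best, toy_cost)
```

Keeping the validation trials out of the parameter search is what allows the validation cost sets to be compared across detector types without the comparison being biased by how well each detector was tuned to the specific trials.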

Results
The entire analysis reported in this paper (generation of the simulated data, the various detection algorithms, optimization [34] of the detector parameters, and analysis of the results) was implemented in MATLAB R2020 (RRID: SCR_001622); alternative languages, e.g. Python [RRID:SCR_008394] or GNU Octave [RRID:SCR_014398], could also be used to recreate this study. A sample of the individual trials from the three sEMG signal models is depicted in Figure 1. A sample output of the different processing stages of the Modified Hodges detector from a Gaussian sEMG signal trial is shown in Figure 3. The Modified Hodges detector low-pass filters (2nd order Butterworth filter) the rectified raw sEMG signal (blue trace in Figure 3). This low-pass filtered signal is the test function of the detector (red trace in Figure 3). The threshold $h$ for this trial is shown by the green horizontal line in Figure 3. The output of the detector (black trace in Figure 3) is 1 whenever the test function crosses the threshold and 0 otherwise. The figure also shows the values of $r_{FP}$, $r_{FN}$, $\Delta t$, and $f(\Delta t)$ for the trial.

Optimal parameters for the different detector types
The 13 detector types were compared after choosing the optimal parameter set for each detector type using the training dataset of 50 trials for each of the six combinations of the three signal models and two SNRs [35]. This procedure is depicted in Figure 4 for the Modified Hodges detector for the 0 dB SNR Gaussian signal model, which shows the outcomes from the different steps in the optimization process described in Algorithm 1.
[Figure 4: outcomes of the optimization steps for the Modified Hodges detector, including the cost $C$ for the optimal parameter combination.]
The values of the optimal parameters for the different detector types are listed in Table 2.

How do the different detector types perform on the different signal models and SNRs?
The performance of the different "optimal" detector types, i.e., detectors using the optimal parameter values, was compared using the 50 trials from the validation datasets. The boxplots of the performance of these different detector types for the three different signal models (Gaussian, Laplacian, and biophysical) are shown in Figure 5(a), (b), and (c), respectively; each of these subplots displays the performance for the 0 dB and -3 dB SNRs in red and blue boxplots, respectively. Note that the detectors are ordered by increasing average cost across the three signal models and two SNRs; the detectors on the left are better than the ones on the right in an average sense. A two-way ANOVA on the effect of the detector type and SNR on performance revealed a significant difference between the detector types (p < 0.001) and SNRs (p < 0.001) for all three signal models. The test revealed a significant interaction between the factors for all three models (biophysical: p < 0.0001; Gaussian: p < 0.0001; Laplacian: p < 0.0001). These statistical results confirm the results shown in the boxplots in Figure 5, where the performance differs among the detector types, with consistently poorer performance for -3 dB compared to 0 dB. The costs for both 0 dB and -3 dB appear to be lower for the biophysical model compared to the Gaussian and Laplacian models.
Most detector types perform similarly, except for the Sample entropy, Continuous wavelet transform (CWT), and Singular spectrum analysis (SSA) detector types, which perform worse across the different signal models and SNRs. Among the other detector types, the Modified Hodges, the approximate generalized likelihood ratio test-Gaussian (AGLR-G), and the approximate generalized likelihood ratio test-Laplacian (AGLR-L) detectors have similar costs for the different signal models and SNRs. The remaining detector types (Root mean square (RMS), Hodges, Bonato, Lidierth, Modified Lidierth, Teager Kaiser Energy Operator (TKEO), and Fuzzy entropy) have slightly higher costs for one or more specific signal models and SNRs. We note that the Fuzzy entropy detector performs very well on the biophysical signal model for both SNRs, with an acceptable cost for more than 95% of the validation trials.

Which detector types have an acceptable cost?
The choice of appropriate detector type(s) for use in robot-assisted therapy requires the specification of an acceptable cost of detection $C_{accept}$. To this end, we specify the upper limits for the false positive rate, false negative rate, and the latency of detection as $C_{accept} = 0.2$, which corresponds to a detector with the following cost components:
$$r_{FP} \le 0.2, \qquad r_{FN} \le 0.2, \qquad \Delta t \le 50\ \text{ms}$$
We believe that these upper limits are a reasonable compromise among the three competing factors determining the cost. Any detector type with costs consistently lower than $C_{accept}$ would be deemed an appropriate detector for use in robot-assisted therapy. To determine the detector types with consistently lower costs than $C_{accept}$, we computed the proportion ($r^{D}_{accept}$) of the 50 validation trials with acceptable cost $C \le C_{accept}$ for the detector $D$ for the three signal models and two SNRs using
$$r^{D}_{accept} = \frac{1}{50} \sum_{i=1}^{50} \mathbb{1}\left[ C^{D}_{i} \le C_{accept} \right]$$
where $C^{D}_{i}$ is the cost of detection for the $i$-th validation trial for detector $D$, and $\mathbb{1}[\cdot]$ is the indicator function. The value of $r^{D}_{accept}$ for the different detector types is shown in Table 3, where the cells with $r^{D}_{accept} \ge 0.8$ are highlighted. We can observe there that:
1. All detectors perform poorly for the -3 dB Laplacian signal model. The highest value of $r^{D}_{accept}$ is 0.22 for this signal model, which interestingly is from the AGLR-L detector designed for the Laplacian signal. Many detectors perform a little better, with higher $r^{D}_{accept}$ values, for the -3 dB Gaussian and biophysical signal models.
2. The Modified Hodges detector is the most consistent detector across the different signal models and SNRs. It has an $r^{D}_{accept} > 0.8$ for the three signal models at 0 dB and the biophysical model at -3 dB SNR.
3. The Fuzzy Entropy detector performs as well as the Modified Hodges detector for the Gaussian and biophysical signal models, but not on the Laplacian model.
4. In terms of the average value of $r^{D}_{accept}$ across the three signal models (last two columns of Table 3), the Modified Hodges detector performs the best for the 0 dB signals, followed by the AGLR-G, AGLR-L, and Fuzzy Entropy detectors, which have slightly lower but similar performance. For -3 dB signals, the Modified Hodges, AGLR-G, and AGLR-L detectors result in similar performances.
Based on these observations, the Modified Hodges appears to be the most consistent detector for low SNR signal models, irrespective of the sEMG signal model.The two statistical detectors (AGLR-G, and AGLR-L) and the fuzzy entropy detectors provide similar but slightly lower performance than the Modified Hodges detector.
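The acceptance proportion used in this section is simple to compute from a set of per-trial validation costs. The Python sketch below uses a hypothetical function name and made-up cost values purely to show the calculation; it assumes the 0.2 acceptable cost and the 0.8 consistency level quoted in the text.

```python
def acceptance_proportion(costs, c_accept=0.2):
    """Fraction of trials whose cost of detection does not exceed c_accept."""
    return sum(1 for c in costs if c <= c_accept) / len(costs)


# Made-up validation costs for one detector (not data from the study).
example_costs = [0.08, 0.12, 0.20, 0.35, 0.15, 0.18, 0.90, 0.10, 0.05, 0.19]
r_accept = acceptance_proportion(example_costs)
is_consistent = r_accept >= 0.8  # consistency level used for highlighting in Table 3
```

With these illustrative values, 8 of the 10 trials have a cost of at most 0.2, so the proportion is 0.8 and the detector would just meet the consistency level.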

Discussion
Movement intention-triggered robot-assisted therapy is one of the options available for severely impaired patients without visible residual movement. sEMG for movement intent detection is a simpler, more direct, and more task-specific alternative to EEG-BCI [6]. The investigation of sEMG-driven robot-assisted therapy requires a sensitive and robust method for the accurate and fast detection of movement intention from residual low SNR sEMG signals. This study systematically investigated existing sEMG detection algorithms in the literature until 2018. The investigation was carried out on
In this analysis, an acceptable cost of 0.2 was chosen for application in sEMG-driven robot-assisted therapy; this corresponds to a latency of 50 ms, an FPR of 20%, or an FNR of 20%. The low latency limit and the relatively high FPR and FNR limits can result in a more sensitive detector being chosen as the optimal detector. We do not believe this is a problem, because the raw output of this detector is unlikely to be used directly to drive the robot assistance. Some form of low-pass or time-based filtering (like the one employed by Ramos-Murguialday et al. [15]) will be employed to filter out short false positive/negative pulses before using it to drive robotic assistance. This filtering operation reduces the FPR and FNR at the expense of introducing additional latency; delays of 200-300 ms are well tolerated with respect to the sense of agency [36]. The amount of filtering applied to the chosen detector's output will need to be determined through feedback from patients/users of the system.
The current study identified that the Modified Hodges detector performed consistently well, with cost $C \le 0.2$ for at least 80% of the validation trials, across the different signal models and SNRs, except for the -3 dB Laplacian signal model, where all detectors fail. The Modified Hodges detector, a simplified version of the Hodges detector, performs better than Hodges because it does not involve the additional averaging step in computing its test function. This reduces the detection latency for the Modified Hodges detector without an appreciable increase in the false positive and false negative rates (Table S4 in the extended data in figshare (RRID:SCR_004328)). The AGLR-G, AGLR-L, and fuzzy entropy detectors perform slightly worse than Modified Hodges but better than the rest of the detectors. The good performance of the statistical detectors agrees with the findings of Staude et al., even with the lower SNRs investigated in this study. The fuzzy entropy detector also performs well, unlike its counterpart, sample entropy. The sample entropy algorithm in this study used the local estimate of the signal's standard deviation for normalizing the data. Sample entropy's poor performance with the local estimate of the standard deviation was previously reported by Zhang et al. Sample entropy performs well only with the global estimate of the signal's standard deviation [14]. This is not suitable for real-time implementation, as estimating the global standard deviation is a non-causal operation requiring the entire signal record. The use of the fuzzy similarity measure addresses this problem with sample entropy, allowing the fuzzy entropy detector to track changes in the overall signal amplitude. Interestingly, fuzzy entropy has a low cost of detection for both the 0 dB and -3 dB biophysical signal models, which could be due to the additional structure of the motor unit action potentials (MUAPs) in the move-phase of the biophysical signal.
Interestingly, the RMS detector we used previously to demonstrate the viability of sEMG as an alternative to detect movement intention in severely impaired chronic stroke subjects [6] was not one of the best performers, as seen in Figure 5 and Table 3. We note that the observed performance was for the RMS detector with parameters optimized using the training dataset (Table 2). This optimized RMS detector had a relatively high false negative rate and a higher detection latency, which resulted in its poor performance. This could possibly explain the lack of agreement between the sEMG and EEG detectors we had observed in our previous study, and a more sensitive detector might have identified sEMG activity in a larger proportion of subjects. The current study results warrant further investigation with real sEMG data from severely impaired patients using other detectors, such as the Modified Hodges, AGLR-G/L, and fuzzy entropy.
In general, most detectors have a relatively lower cost of detection for the biophysical signal model, compared to the Gaussian and the Laplacian signal models. The reasons for the better performance on the biophysical model are not entirely clear, except for the fuzzy entropy detector, which might be sensitive to the temporal structure of the simulated data (MUAP) from the biophysical model. One possibility is the difference in the spectra of the signals from the biophysical model compared to the Gaussian or Laplacian models (Figure 1); more signal energy is concentrated in the lower frequencies for the biophysical model than in the Gaussian or Laplacian models. Most detectors compute their test functions through a lowpass filtering or averaging operation, which could retain a relatively larger portion of the signal in the biophysical model compared to the Gaussian and Laplacian ones, thus resulting in improved performance with the biophysical model. If this is correct, then the difference in performance between the biophysical and the Gaussian/Laplacian models should disappear when an appropriate spectral shaping filter is used in the Gaussian and Laplacian models, yielding a spectrum like that of the biophysical model. Finally, among the Gaussian and Laplacian models, the relatively poorer performance with the Laplacian signal model could be due to the long tails of the Laplacian distribution.
The simulated data used in the current study rely on a step change in the signal properties between the rest and move phases, and an sEMG signal of fixed amplitude during the move phase. These assumptions will be violated when dealing with feeble sEMG signals recorded from impaired participants with no visible residual movements. In such participants, movement attempts are likely to produce intermittent bursts of sEMG activity with smooth transitions between the on and off states in the target muscles. The sEMG signal might have time-varying amplitude even when the participant can continuously activate the muscle for sufficient duration. Although based on idealized simulated sEMG data, the current results do provide some idea about the detector types that can potentially work on real low SNR sEMG signals; a detector performing poorly on ideal data is likely to perform worse with real data. Furthermore, the results from the current analysis also indicate that the modified Hodges, AGLR-G, AGLR-L, and fuzzy entropy detectors are likely to pick up even short bursts of sEMG signals since they have small detection latency (Δt ≤ 50 ms).
The detectors studied in this paper can be used for on-off control of robotic assistance, 37 where once sEMG activity is detected, robotic assistance drives the limb towards the target in a preprogrammed fashion. The choice of the best control variable depends on which one of these is sensitive, robust, and provides a natural human-robot interaction with minimal lag. However, it should be noted that it is unclear how well severely impaired participants, with no visible residual movements, can finely modulate their sEMG activity; a screening procedure will be required to evaluate the ability of the participant to modulate sEMG activity in the target muscle.
The study has limitations that are worth noting to ensure that the results are interpreted appropriately. The study relies entirely on simulated data to investigate the different detectors. The conclusions are thus only as good as the assumed signal models and how well they represent the residual sEMG signals of patients with no visible movements. This is the first study investigating detectors for low SNR sEMG, and thus the use of simulated data was essential to gain some understanding of the performance of the different detectors. Simulated data also allow complete control of the ground truth, which provides a more truthful characterization of the different detectors' detection accuracy and latency. The use of three different signal models to investigate the different detectors also adds some robustness to the study's findings. Additionally, this analysis allows us to exclude the poorly performing detectors and identify the ones that warrant further investigation with real data. Another potential limitation of the use of simulated data is the availability of complete information about the ground truth against which the different detectors are compared. However, the results of the current study cannot be verified with real data because we will never know the ground truth in the surface EMG from patients with no visible residual movements. This is a valid concern. Nevertheless, some form of an unsupervised approach will be required for verifying the results of the current study with real data. With real data, the best detector would be the one that consistently provides the maximum separation between the probability density functions of its test function $g[n]$ under the rest phase and the move phase.
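One way to make the idea of "separation" between the rest-phase and move-phase distributions of a test function concrete is sketched below. The paper does not name a separation measure, so the d-prime style index used here is an assumption chosen for illustration, as are the function name `separation_index` and the premise that test-function samples can be assigned to rest and move phases from the cued trial timing.

```python
import math
from statistics import mean, pvariance


def separation_index(g_rest, g_move):
    """Illustrative d-prime style index between two sets of test-function values.

    g_rest: test-function samples taken during the (cued) rest phase.
    g_move: test-function samples taken during the (cued) move phase.
    The measure itself is an assumption; the paper does not specify one.
    """
    if len(g_rest) < 2 or len(g_move) < 2:
        raise ValueError("need at least two samples per phase")
    # Average the two population variances to get a common spread estimate.
    pooled_var = 0.5 * (pvariance(g_rest) + pvariance(g_move))
    if pooled_var == 0:
        raise ValueError("both phases have zero variance")
    # Distance between the phase means in units of the pooled standard deviation.
    return abs(mean(g_move) - mean(g_rest)) / math.sqrt(pooled_var)
```

Under this sketch, a detector whose test function gives a consistently larger index across trials would be preferred; other measures of distributional overlap could be substituted without changing the overall procedure.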

Conclusions
This paper systematically investigated existing sEMG detection algorithms on low SNR sEMG signals simulated using three different signal models (two phenomenological models, Gaussian and Laplacian, and a biophysical model) at two different SNRs (0 dB and -3 dB). The modified Hodges detector, a simplified version of the threshold-based Hodges detector introduced in the current study, was found to be the most consistent detector across the different signal models and SNRs. This detector had false positive and false negative rates lower than 20% and a detection latency lower than 50 ms for almost 90% of the trials on which it was tested for 0 dB SNR and more than 40% of the trials for -3 dB SNR. The two statistical detectors (Gaussian and Laplacian Approximate Generalized Likelihood Ratio) and the fuzzy entropy detector have a slightly lower performance than modified Hodges. Overall, the modified Hodges, Gaussian and Laplacian approximate generalized likelihood ratio, and fuzzy entropy detectors were identified as potential candidates for further validation with real sEMG data on a population of severely impaired patients. The current study forms the first step towards developing a simpler, practical, and robust sEMG-based human-machine interface for triggered robot-assisted therapy in severely impaired patients.
Algorithm 1: Procedure for selecting the best parameter combination for the detector type.
Let the parameter set for the given detector type be $\Pi = \{p_1, p_2, \ldots, p_m\}$.
- Set the parameter ranges for the individual parameters $p_i$ in the parameter set $\Pi$ for the detector type, which results in $K$ different parameter combinations.
- Compute the overall detector performance for the $j$-th trial as $P_j$.
- Get the best parameter combination for the detector type as the following,

First of all, I would like to confirm the detailed review of Vincent Crocher. But I would like to underline his criticism of the lengthy elaboration of the rehabilitation issue. The manuscript focusses on event detection in stochastic signals and, therefore, it can be mapped to the biosignal processing field. The EMG processing is only specifically addressed by the inclusion of the biophysical signal model. And the topic "robots in rehabilitation" is a separate field not really related to the main issue of this manuscript. Thus, the total rehabilitation part, not only the BCI passages, can be reduced, as Vincent already states, to a single sentence and a removal of about 25 references.

It also sounds contradictory to refer to the case "without residual movement" but to measure surface EMG. What presumably is addressed by the manuscript are combined muscle excitations, i.e. to generate stiffness in the arm when fingers are grasping an object. But such a stiffness in the forearm is ambiguous, because the same stiffness will be generated with different finger movements. I will not continue this rehabilitation issue, because it is not the concern of this manuscript, as mentioned above.

Summarizing: The main results are interesting and they should be provided in a compressed form.

If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Biosignal Processing - human motor control
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
The manuscript focusses on event detection in stochastic signals and, therefore, it can be mapped to the biosignal processing field. The EMG processing is only specifically addressed by the inclusion of the biophysical signal model. And the topic "robots in rehabilitation" is a separate field not really related to the main issue of this manuscript. Thus, the total rehabilitation part, not only the BCI passages, can be reduced, as Vincent already states, to a single sentence and a removal of about 25 references.
Response: Our primary goal is to identify a suitable EMG detector for on-off control of robotic assistance during therapy. This paper focusses on the problem of event detection in EMG signals, as this is the first step towards implementing an effective closed-loop control of robotic assistance using EMG. The result of this analysis identifies the most promising detectors for application in EMG-driven robot-assisted therapy. The "cost" for identifying the detectors is also designed accordingly. Therefore, in the introduction we have given the background about the necessity to identify the best detector that works well with low SNR EMG signals. However, we have reduced the introduction and discussion sections, which now highlight only the need for identifying the best detector for applications such as EMG-driven robot-assisted therapy.
Original text: Substantial recovery of sensory-motor function after a stroke is possible with high-intensity and high-dosage movement training. 1 Rehabilitation robots can facilitate such high-intensity movement training while providing physical assistance to the user for completing movements consistently and precisely. About 30% of stroke survivors are severely impaired 2 , 3 and require physical assistance to actively engage in movement training. While physical assistance from a robot can motivate such subjects to attempt and train movements, they can also provide inappropriately timed or too much assistance 4 , 5 leading to slacking, where patients reduce their voluntary effort and exploit robotic assistance to perform the movements. Inappropriately timed robotic assistance also alters the patient's sense of agency or subjective awareness of control. This could lead to a lack of intrinsic motivation and attention, affecting motor learning and performance. 6 Positive therapeutic effects have been observed only when patients actively engage in therapy. 7 -10o et al. 11 reported no improvement in clinical scales with passive range of motion therapy. Thus, active patient participation during movement therapy is an essential ingredient for sensorimotor recovery. Robotic assistance contingent on a subject's intention to move is an effective way to guide neuroplasticity.

Reduced text in the manuscript:
Active patient participation during movement therapy is an essential ingredient for sensorimotor recovery after a stroke. About 30% of stroke survivors are severely impaired 2 , 3 and require physical assistance to actively engage in movement training. Robotic assistance can motivate such subjects to attempt and train movements, and when it is contingent on the intention to move it may be an effective way to guide neuroplasticity. 12

It also sounds contradictory to refer to the case "without residual movement" but to measure surface EMG. What presumably is addressed by the manuscript are combined muscle excitations, i.e. to generate stiffness in the arm when fingers are grasping an object. But such a stiffness in the forearm is ambiguous, because the same stiffness will be generated with different finger movements. I will not continue this rehabilitation issue, because it is not the concern of this manuscript, as mentioned above. Summarizing: The main results are interesting, and they should be provided in a compressed form.

Response:
In the study by Balasubramanian et al. [1], about 70% of severely impaired stroke patients without residual movement in the fingers and wrist had residual EMG.
The lack of residual movements despite the presence of EMG in stroke patients could have several causes, one of which is co-contraction, in which case we could still pick up residual EMG. We agree with the reviewer that the same stiffness can be generated with different finger movements. However, our objective is not to decode stiffness, but to see if there is EMG activity in a muscle of interest, that is, a muscle the patient is trying to activate while attempting a movement.
The patient could be activating other (antagonists or unrelated) muscles as well, but this is not relevant to the question of whether the muscle of interest displays EMG activity when it is being activated.We are primarily interested in the most appropriate detector for low SNR EMG from any given muscle, irrespective of what happens with other muscles.Once such a detector is identified, different forms of therapeutic interventions become possible, where patients could be asked to attempt activation of certain muscles while suppressing others.
The authors propose to evaluate and compare 13 sEMG onset detectors specifically on low signal-to-noise EMG (0 dB and -3 dB). The authors propose a thorough work and use an appropriate methodology (including generation of simulated EMG signals, wide range of detectors and tuning, outcome evaluation cost and analysis) which is aligned with the objective of the paper and the application (the use of EMG onset for rehabilitation robotics control).
The paper is clearly of interest in the field of neuro-rehabilitation robotics and I believe that both the results and the methodology used (and associated code provided, which is reasonably documented) are of direct interest to the field.

Introduction:
○ The part on BCI drawbacks can be shortened into a simple sentence to keep the introduction more straightforward (this is not the focus of the paper).
○ "In this paper, EMG always refers to surface-recorded muscle": this indeed makes sense for the neuro-rehabilitation application of interest here and even more generally given the wider adoption and ease of use of surface EMG. I would thus suggest to bring this statement as early as in the introduction and adopt the sEMG acronym across the paper as this is an unambiguous terminology widely used.

Methods:
○ In the "Simulation of surface EMG signal" section, I would suggest to move the part of the last paragraph ("The phenomenological models were based on the work of De Luca...") to the next section to ease the reading. This part relates to the phenomenological models only.

○ Figure 1, presenting examples of the three types of generated signals, shows the biophysical signal with a smaller amplitude (20 times) than the phenomenological ones. While the absolute amplitude likely does not affect the detectors, this is slightly confusing. Is there any valid reason for this difference, or can these three signals be equalised to the same amplitude?

○ In Table 1, authors refer to 'Weights' for \alpha whereas it appears that there is only a single value for each detector. Please clarify.

○ In Table 1, while RMS and Lidierth detectors do use a filter for the conditioning, those filters' cut-off frequencies are not listed as parameters. Did the authors intentionally choose fixed, non-optimised frequencies for those (and if so, why)? Or is this a simple oversight in the table?

○ After Eq 12, authors should clearly introduce \alpha and refer to it as weight as this is the first time it is properly introduced. I would also recommend making it explicit earlier on, as the Weight (weights?) are introduced in Table 1 earlier and earlier in the text but without explanation, making it hard to follow.

Methods: On the cost definition:
○ According to Eq. 16, f(deltat) is 0 (ideal) when the detection happens before the actual onset: \deltat<0ms and so is effectively a False positive. While I understand that this cannot be avoided with the chosen structure of the cost: this suggests that the within the overall cost, the latency and the false negative rates are more redundant whereas the false negative rate is quite independent of the other two.

○ More generally, as the authors show that most detectors already behave quite poorly for such low SNR sEMG, the chosen cost bounds, and especially the 20%, seem quite high for actual use. Given the application in mind, where the objective is to detect one true onset as quickly as possible (and 50 ms appears reasonable for robotic assistance), I wonder if a cost accounting for a more strict false negative rate and more tolerant (if accounted at all) for false positive would make more sense. Indeed, only the earliest positive detection after onset can be considered useful for an application using this signal as a trigger. Wouldn't it make sense to also train and evaluate the detectors on a cost based solely on a very strict false positive rate (5%?) and not include FN in the cost (given that only the earliest detection after true onset matters)? Regarding the FP and FN rates, those are also calculated per trial, meaning that 20% (or even 5%) can mean that each trial has at least several FP before onset and not that 20% (or 5%) of the trials do have a FP. It is this latter figure that is of direct interest for the application. Authors should clearly discuss this point on how these results translate to real scenarios and provide results showing in what percentage of test trials FP detection happens before onset.

Discussion:
○ It could be interesting to discuss the possibility that the signal processing (detectors) might not necessarily be implemented at the same frequency as the one at which the sEMG data is collected. Indeed, it is common in existing systems to have an EMG acquisition faster than the robotic controller (typically a >2kHz EMG sampling for a 1kHz controller). Do authors have clues on how that might differently affect the different detectors?

○ While the use of simulated (generated) EMG signals to evaluate the performance of the detector is appropriate and allows for robust and repeatable outcomes, it would be an interesting addition to also perform a short evaluation on actual low signal-to-noise EMG datasets.

○ On that point of potential difference with actual EMG signals, authors rightly point out that they here test the optimal behaviour of the detectors, i.e. they would likely all perform worse on actual signals, especially when muscle activation is not constant over a long period. Can authors infer and discuss some difference in performance between the detectors for short activation (typically flickers that can be encountered in rehabilitation applications)? Some detectors are based on relatively long windows of 150 ms, and while authors explain in the discussion that the best detectors have a latency of <50 ms and so are likely suitable for short activation bursts, how would this affect the detectors with a longer window? Could they likely be optimised differently for such signals?

○ In Table 2, some window sizes are in (ms), some in (s).

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Yes

detection, which can result in inappropriately timed robotic assistance leading to suboptimal recovery. 12 , 21 EEG-BCI modalities do not precisely identify which limb segment is intended to move due to its low information rate. 21 Event-related desynchronization of the EEG sensorimotor rhythm may not necessarily reflect movement intention. 22 Critical for practical use, current EEG-BCI systems are too complex and time-consuming for clinical work. 23

Change in the manuscript:
They often exhibit a poor signal-to-noise ratio, with significant trial-to-trial intra-subject variability [20]. Unlike EMG systems, EEG-BCI modalities lack task specificity [21]. Moreover, their complexity and time-consuming nature prevent their routine clinical use [23].

"In this paper, EMG always refers to surface-recorded muscle": this indeed makes sense for the neuro-rehabilitation application of interest here and even more generally given the wider adoption and ease of use of surface EMG. I would thus suggest to bring this statement as early as in the introduction and adopt the sEMG acronym across the paper as this is an unambiguous terminology widely used.

Response:
We have moved the sentence to the introduction and used "sEMG" throughout the paper.

Change in the manuscript:
In this paper, we consider surface-recorded muscle electromyogram activity (sEMG), as this modality will most likely be employed in routine robot-assisted therapy.sEMG could be used as a viable alternative to address the drawbacks of EEG-BCI for robot-assisted therapy.

Methods:
In the "Simulation of surface EMG signal" section, I would suggest to move the part of the last paragraph ("The phenomenological models were based on the work of De Luca...") to the next section to ease the reading.This part is relative to the Phenomenological models only.

Response:
We have moved the paragraph to the phenomenological models' section as suggested.

Change in the manuscript:
The phenomenological models were based on the work of De Luca 37 where an sEMG signal from a muscle activated at a fixed level can be treated as zero-mean white noise followed by a shaping filter (electrode properties); this model is widely accepted in the literature.

Response:
The amplitude of the biophysical model is in mV since it is modelled using the physics of the EMG generation and recording process. With the phenomenological models, in contrast, the EMG signal is generated as bandpass-filtered white noise, and thus its amplitude is in arbitrary units. We have now addressed this issue by scaling the biophysical model signals to have amplitudes similar to those of the phenomenological models to avoid any confusion.
In Table 1, authors refer to 'Weights' for \alpha whereas it appears that there is only a single value for each detector.Please clarify.

Response:
The optimal parameter for each detector type is selected by searching over a range of values for each parameter. The performance of a detector is tested with different weights in the range α ∈ {1,2,3,4,5}. However, only one value per detector is used for computing the threshold. "Weights" in Table 1 has been changed to "Weight".

Response:
The Lidierth detector utilizes rectification for signal conditioning and does not incorporate a low-pass filter. This oversight in the table has been rectified. On the other hand, the RMS detector, as per the original paper, employed a band-pass filter between 10 Hz and 250 Hz (fs/2) on recorded raw EMG data to obtain the EMG signal within the required frequency spectrum. However, in this analysis, simulated data was generated by band-pass filtering white noise between 10 and 450 Hz. Therefore, the band-pass filter for signal conditioning in the RMS detector was redundant. Hence, we implemented the detector without the band-pass filter and achieved similar results.
After Eq 12, authors should clearly introduce \alpha and refer to it as weight as this is the first time it is properly introduced.I would also recommend to explicit it earlier on, as the Weight (weights?) are introduced in Table 1 earlier and earlier in the text but without explanation making it hard to follow.

Response:
Thank you, we have followed this suggestion as follows:

Change in the manuscript:
The threshold h is adaptive and is calculated for each trial by adding α times the standard deviation of the first 3 seconds of data (Figure 3) to the mean. α is termed the weight for the threshold in this paper.
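The threshold rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the choice to compute the baseline statistics on the detector's test function, and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def adaptive_threshold(test_function, fs, alpha, baseline_s=3.0):
    """Return h = mean + alpha * std over the first `baseline_s` seconds.

    test_function: sequence of test-function values for one trial (assumed input).
    fs: sampling rate in Hz; alpha: weight for the threshold.
    """
    n_baseline = int(baseline_s * fs)
    baseline = test_function[:n_baseline]
    if len(baseline) < 2:
        raise ValueError("baseline segment is too short")
    # Population standard deviation is an assumption; the text does not specify.
    return mean(baseline) + alpha * pstdev(baseline)


def detect(test_function, h):
    """Binary detector output: 1 where the test function exceeds the threshold h."""
    return [1 if g > h else 0 for g in test_function]
```

Because the threshold is recomputed from each trial's own rest segment, a single weight α per detector can still adapt to trial-to-trial changes in baseline noise.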

Methods: On the cost definition:
According to Eq. 16, f(deltat) is 0 (ideal) when the detection happens before the actual onset: \deltat<0ms and so is effectively a False positive.While I understand that this cannot be avoided with the chosen structure of the cost: this suggests that the within the overall cost, the latency and the false negative rates are more redundant whereas the false negative rate is quite independent of the other two.

Response:
The response is with the understanding that the reviewer's comment is as follows: "This suggests that the within the overall cost, the latency and the false negative rates are more redundant whereas the false positive rate is quite independent of the other two."
We performed a correlation analysis between the false negative rate and the latency, which showed:
○ Moderate positive correlation between false negative rate and latency with the 0 dB SNR signal of the Gaussian and biophysical models (Figure 1).
○ No correlation between FNR and latency with -3 dB SNR across all three models (Figure 2).
Therefore, the three factors are independent at lower SNR and there is a need to account for them separately in order to evaluate the performance of the detector. Scatter plots of FNR vs. latency at both 0 dB and -3 dB for one of the detectors, Hodges, are presented in Figures 1 and 2, respectively. Similar results are observed for all other detectors.
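A correlation check of this kind can be sketched as below. The response does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the function name and the per-trial input lists are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences,
    e.g. per-trial false negative rates and per-trial latencies."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

A value near zero across trials would be consistent with treating the two quantities as separate terms of the cost, while a value near one would suggest redundancy.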

Response:
Firstly, we have modified the term EMG-triggered robot-assisted therapy to EMG-driven robot-assisted therapy to emphasise that we are interested in implementing a therapy modality where the robot-assisted movement is contingent on the presence or absence of sEMG. Continued assistance requires the continued presence of sEMG. This has been modified in the manuscript. We have also changed "sEMG-triggered" to "sEMG-driven".

Original text:
In sEMG-triggered robot-assisted therapy, robotic assistance is triggered whenever sEMG is detected in real-time in the target muscle during the move phase of a trial.

Change in the manuscript:
In sEMG-driven robot-assisted therapy, the robot remains inactive during the rest phase, while in the move phase the robot-assisted movement is contingent upon the presence or absence of sEMG at any given time, thus continued sEMG is required to continuously receive robotic assistance.

Response:
Making robot-assistance contingent on continued presence of sEMG is important to keep the subject actively involved during the entire duration of a task, and thus avoid slacking.
Such a modality requires the rejection of false positives, quick detection of sEMG onset, and the continued detection of sEMG when it is present (i.e. low false negatives). Low false negative rates may be essential for the user's motivation and sense of agency. For higher SNR signals, including the false negative rate in the cost function may not be important, but it is crucial for low SNR signals (0 dB or -3 dB). Also, the sample entropy, SSA and CWT detectors, which performed worse across all three models, had acceptable false positive rate and latency, but the false negative rate alone was more than 20%, which contributed to the higher cost. These detectors would also be considered suitable for application in robot-assisted therapy if the false negative rate were not included in the cost. This suggests that it is important to optimise for the false negative rate as well, since we are looking at low SNR EMG signals.
We have now included a short paragraph explaining why the three terms are included in the cost function:

Original text:
The performance of a detector must consider the accuracy (false positive and false negative rate) and the detection latency.

Change in the manuscript:
An optimal EMG detector designed for use in EMG-driven robot-assisted therapy should possess the capability to quickly identify the onset of sEMG, efficiently eliminate false positives, and consistently detect sEMG when it is present (lower false negatives).Such a detector is essential for maintaining user's motivation and sense of agency.

Response:
We agree that the 20% false positive rate might be too high. But since we are looking at very low SNR signals, a strict FPR of 5% might rule out all the detectors as unsuitable for application in robot-assisted therapy. Also, optimising for such a low FPR could lead to parameter choices that compromise the true positive rate and latency, since we are working with low SNR signals. Therefore, we have decided to use a more sensitive detector. We do not think that choosing a sensitive detector is a problem, because the output of the detector will not be used to directly drive robot assistance. Some form of low-pass filtering or time-based filtering will be employed to smooth out jittery outputs from the detector before activating robotic assistance. Such filtering operations on the detector output are commonly employed (e.g. Ramos-Murguialday et al., 2013). This can help prevent most short-duration false-positive pulses from activating robotic assistance. This filtering operation will increase the latency, which can impact the sense of agency and the user's perception of the system's responsiveness. Previous work indicates that a 200-300 ms delay is well tolerated when reporting agency. Starting with a sensitive detector, we can experiment with different levels of filtering of the output with feedback from patients to yield an optimal detector. Individualising the detector and the filtering block is likely to help patients feel in control of their own movements.
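As a concrete illustration of the time-based filtering mentioned above, the sketch below passes a detection only after the raw detector output has stayed high for a minimum number of consecutive samples. This particular filter, its name, and its parameter are assumptions made for illustration; the response does not specify which filter would be used.

```python
def min_duration_filter(detections, min_samples):
    """Suppress detection pulses shorter than `min_samples` consecutive samples.

    detections: binary detector output (0/1), one value per sample.
    The filtered output goes high only once the raw output has been high for
    `min_samples` samples in a row, which adds (min_samples - 1) samples of latency.
    """
    if min_samples < 1:
        raise ValueError("min_samples must be at least 1")
    filtered = []
    run_length = 0  # number of consecutive high samples seen so far
    for d in detections:
        run_length = run_length + 1 if d else 0
        filtered.append(1 if run_length >= min_samples else 0)
    return filtered
```

The trade-off discussed in the response is visible directly in the parameter: a larger `min_samples` removes longer false-positive pulses but delays every true detection by the same number of samples.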

We have now included this in the Discussion to explain the 20% cut-off choice for determining a good detector.

Change in the manuscript:
In this analysis, a cost of 0.2 was selected for the application in sEMG-driven robot-assisted therapy, corresponding to a 50 ms latency, 20% FPR or 20% FNR. Permitting up to a 20% false positive rate might be too high, leading to the selection of a highly sensitive detector as the best choice. However, we believe that choosing a sensitive detector is not a problem, because the raw output of this detector is unlikely to be used directly to drive the robot assistance. Some form of low-pass or time-based filtering (like the one employed by Ramos-Murguialday et al. 1 ) will be employed to filter out short false positive/negative pulses before using it to drive robotic assistance. This filtering operation reduces the FPR and FNR at the expense of introducing an additional latency; a delay of 200-300 ms is well tolerated when reporting the sense of agency 2 . The amount of filtering of the chosen detector's output will need to be determined through feedback from patients/users of the system. A 20% false positive rate does not imply that 20% of the trials have a false positive. Instead, we assessed the consistency of the detectors by determining the number of test trials with a cost less than 0.2. A suitable detector is identified as one that has 80% of test trials with a cost less than 0.2, as indicated by the grey boxes in Table 3. These highlighted detectors exhibit more than 80% of test trials with at most a 20% false positive rate. This means that every trial could have false positives, and in our analysis the best detectors are the ones that consistently show less than 20% false positives in each test trial.
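The consistency criterion can be expressed in a few lines. The sketch below assumes the per-trial costs have already been computed by the paper's cost function (not reproduced here); the function names are hypothetical, and the inclusive comparison follows the C ≤ 0.2 form used in the Discussion.

```python
def fraction_within_cost(trial_costs, max_cost=0.2):
    """Fraction of test trials whose cost does not exceed `max_cost`."""
    if not trial_costs:
        raise ValueError("no trials given")
    return sum(1 for c in trial_costs if c <= max_cost) / len(trial_costs)


def is_suitable(trial_costs, max_cost=0.2, min_fraction=0.8):
    """A detector counts as suitable if at least `min_fraction` of trials meet the cost bound."""
    return fraction_within_cost(trial_costs, max_cost) >= min_fraction
```

Note that this counts trials, not detections: a trial with a few false positives can still satisfy the bound as long as its overall cost stays within 0.2.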

Discussion:
It could be interesting to discuss the possibility that the signal processing (detectors) might not necessarily be implemented at the same frequency as the one at which the sEMG data is collected. Indeed, it is common in existing systems to have an EMG acquisition faster than the robotic controller (typically a >2kHz EMG sampling for a 1kHz controller). Do authors have clues on how that might differently affect the different detectors?

Response:
Although the EMG acquisition happens at a faster rate than the robotic controller, the performance of the different detectors is unlikely to be affected by the sampling rate. Consider that the EMG detectors are implemented at the same frequency as the EMG acquisition system (1 kHz), but the robotic controller is running at a lower sampling frequency of 100 Hz. The output of the EMG detector then has to be downsampled to match the frequency of the robotic controller, which might increase the latency by a few milliseconds (up to 10 ms); this is unlikely to affect the overall behaviour of the EMG-triggered robot assistance system.

While the use of simulated (generated) EMG signals to evaluate the performance of the detector is appropriate and allows for robust and repeatable outcomes, it would be an interesting addition to also perform a short evaluation on actual low signal-to-noise EMG datasets.

Response:
We believe that evaluating the performance of the detector with real EMG datasets is outside the scope of this paper, in part because the same analysis approach cannot be applied to real data: we do not have full control over the amplitude and timing of real EMG data, and we might not always have the ground truth. However, we are investigating whether the outcomes of this study are supported by real data from patients from a previously concluded study.

On that point of potential difference with actual EMG signals, the authors rightly point out that they test here the optimal behaviour of the detectors, i.e. they would likely all perform worse on actual signals, especially when muscle activation is not constant over a long period. Can the authors infer and discuss some difference in performance between the detectors for short activations (typically flickers that can be encountered in rehabilitation applications)? Some detectors are based on relatively long windows of 150 ms, and while the authors explain in the discussion that the best detectors have a latency of <50 ms and so are likely suitable for short activation bursts, how would this affect the detectors with longer windows? Could they likely be optimised differently for such signals?

Response:
From Table 3, we can see that the detectors for which more than 80% of validation trials have a cost consistently lower than 0.2 (latency < 50 ms) have an optimised window size of at most 100 ms (Table 2). The AGLR-G detector under the biophysical model (0 dB and -3 dB) and the Laplacian model (-3 dB), with a window size of 150 ms, does not fall among the detectors with acceptable cost. This shows that detectors with a larger window size are less likely to be suitable for short activation bursts. However, on average across all three models, the AGLR-G detector had a reasonable percentage of validation trials with cost less than 0.2; we have therefore mentioned in the Discussion that it is also one of the detectors likely to pick up short bursts.

We conducted an analysis on EMG signals with short pulses of 500 ms duration. The results indicated that the optimised parameters differed between the step EMG and the pulse EMG. We also included the offset time in evaluating the performance of the detector. When optimising for offset time, higher weights for the threshold and a greater cut-off frequency for the lowpass filter (smaller window sizes) were selected to minimise the offset latency. Hence, detectors optimised on step EMG with relatively longer window sizes may exhibit suboptimal performance on EMG with intermittent muscle activity, as they might fail to detect short activation bursts.

In Table 2, some window sizes are in (ms), some in (s).

Response:
These have been corrected in the manuscript.

Figure 1. Characteristics of the sEMG signals generated from the three models. The three rows correspond to the three different signal models: Biophysical in the top row, Gaussian in the middle, and Laplacian in the bottom row. The leftmost column shows the time series of the simulated 13 seconds of data, with the first 8 seconds corresponding to the rest phase and the next 5 seconds to the move phase. The middle column shows the corresponding Fourier magnitude spectrum of the 5 seconds of move phase data. The right column displays the estimate of the probability density functions of the 5 seconds of move phase data from the three models.

Figure 2. A general structure for sEMG detectors as proposed by Staude et al.7

Figure 3. A representative example of a trial from the Gaussian signal model with -3 dB SNR run through the Modified Hodges detector. The plot shows the rectified sEMG signal, its lowpass filtered output, and the binary output from the detector. The trial is 13 seconds long, with the first 8 seconds corresponding to the rest phase and the next 5 seconds to the move phase. The rest phase is further divided into the baseline phase (yellow background), which is used for computing the threshold $h$, and the remaining rest phase (red background), which is used for computing $r_{FP}$. The move phase (green background) is used to compute $\Delta t$ and $r_{FN}$.
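The processing chain in the Figure 3 caption (rectification, lowpass filtering, thresholding against a baseline, then per-phase metrics) can be sketched in Python as below. This is a generic threshold detector for illustration only, not the authors' Modified Hodges implementation. Several details are assumptions: a causal moving average stands in for the lowpass filter, the threshold is taken as the baseline mean plus a weighted baseline standard deviation, $r_{FP}$ and $r_{FN}$ are computed as sample fractions, and the latency is the time of the first detection after onset. The function names are hypothetical.

```python
import statistics


def detect(emg, fs, baseline_s, window_s, weight):
    """Binary detector output: rectify, smooth, compare with a baseline threshold."""
    rectified = [abs(x) for x in emg]
    win = max(1, int(window_s * fs))
    # Assumption: a causal moving average stands in for the lowpass filter.
    smoothed, running = [], 0.0
    for n, x in enumerate(rectified):
        running += x
        if n >= win:
            running -= rectified[n - win]
        smoothed.append(running / min(n + 1, win))
    # Assumption: threshold h = baseline mean + weight * baseline std.
    base = smoothed[: int(baseline_s * fs)]
    h = statistics.mean(base) + weight * statistics.pstdev(base)
    return [1 if s > h else 0 for s in smoothed]


def trial_metrics(y, fs, baseline_s, onset_s):
    """Return (r_fp, r_fn, latency_s) for one trial's binary output y."""
    b, on = int(baseline_s * fs), int(onset_s * fs)
    rest, move = y[b:on], y[on:]
    r_fp = sum(rest) / len(rest)        # fraction of rest samples flagged active
    r_fn = 1 - sum(move) / len(move)    # fraction of move samples missed
    first = next((n for n, v in enumerate(move) if v), None)
    latency = None if first is None else first / fs
    return r_fp, r_fn, latency
```

In this sketch the window length and threshold weight are the tunable parameters, which mirrors the trade-off noted in the responses: longer windows smooth out false positives but delay detection.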

Figure 4(a) shows the histograms of the cost $C$ for the different parameter combinations, in light blue traces. These histograms are estimated from the cost values $\{C_i\}_{i=1}^{50}$ obtained from the 50 trials in the training dataset for the different combinations of the detector parameters. The scatter plot of the median $c_{med}$ and inter-quartile range $c_{iqr}$ of these histograms is shown in Figure 4(b). The best parameter combination for the detector was chosen as the one with the least Euclidean norm $\sqrt{c_{med}^2 + c_{iqr}^2}$, which is shown as the red circle in Figure 4(b); its corresponding histogram is shown as the thick red trace in Figure 4(a). Figure 4(c) shows the marginal histograms of the individual contributors $r_{FP}$, $r_{FP}$

Figure 4. Outcomes of the parameter optimization process for the Modified Hodges detector as described in Algorithm 1. (a) Estimated probability density function of cost for the different combinations of parameters (in light blue traces). The red trace corresponds to the cost of the optimum parameter combination. (b) Scatter plot of median vs. IQR for the cost of different parameter combinations. (c) Estimated histograms of latency, false positive rate, false negative rate, and cost of the optimum parameter.

Figure 5. Boxplot of the cost of performance of the different detectors in the validation datasets from the different signal models and SNRs. All detectors shown in this plot use the optimal detector parameter sets optimized on the training dataset. (a) Cost of detection for the Gaussian signal model, (b) Laplacian signal model, and (c) Biophysical signal model. The red and blue boxplots are for 0 dB and -3 dB SNR, respectively. The red dashed line is the acceptable cost $C_{accept} = 0.2$.
using three different signal models with low SNR of 0 dB and -3 dB. These SNRs correspond to feeble sEMG signals compared to regular sEMG recordings from healthy individuals. Using three different signal models (two phenomenological and one biophysical) makes the study results robust to assumptions about the simulated sEMG data. The study by Staude et al., published in 2001,7 compared different sEMG detectors for accurate sEMG onset-time detection. They employed a Gaussian signal model with ramp variance profiles (with varying slopes) at SNRs of 3 dB to 12 dB in their analysis7 and found the AGLR statistical detector to be the best in terms of onset detection, while the Hodges detector performed poorly.7 Although there are some similarities between the current study and that of Staude et al., the two differ in several ways: (a) the current study is focused on real-time detection, while Staude et al.'s primary goal was offline analysis; (b) the current study employed lower SNR signals, which is important considering its application to detecting motion intention in severely affected stroke patients; (c) the current study tested three different signal models, while Staude et al. used only the Gaussian signal model; (d) the primary performance measure in Staude et al. was onset detection latency, while the current study used a composite performance measure (or cost) consisting of the false positive rate, false negative rate, and detection latency; (e) the rationale for the choice of the specific detector parameters was not explicitly mentioned in Staude et al., whereas in the current study the detector parameters were optimized through a brute-force search to ensure that the best detectors from each detector type were compared; and (f) the current study investigates a wider class of detector types than Staude et al., including detectors published after 2001.

For $j = 1$ to $K$ parameter combinations:
- Compute the output of the detector $\{y_i[n]\}_{i=1}^{50}$ for the chosen combination of parameter values for each of the 50 trials in the training dataset.
- Compute the cost $\{C_i\}_{i=1}^{50}$ for each of the 50 trials.
- Compute the median $c_{med}$ and inter-quartile range $c_{iqr}$ of the cost values from the 50 trials.
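The loop above, combined with the minimum Euclidean norm rule $\sqrt{c_{med}^2 + c_{iqr}^2}$ used to pick the best parameter combination, can be sketched in Python as follows. The sketch assumes that the per-trial costs for each parameter combination are already available in a dictionary; the function name `select_parameters` and the dictionary layout are hypothetical. The quartile estimate uses the default method of Python's `statistics.quantiles`, which may differ slightly from the quartile definition used in the study.

```python
import math
import statistics


def select_parameters(costs_by_params):
    """Pick the parameter combination with the smallest sqrt(c_med^2 + c_iqr^2).

    costs_by_params maps a parameter tuple to the list of per-trial costs
    obtained on the training dataset (at least two trials per combination).
    """
    best, best_norm = None, math.inf
    for params, costs in costs_by_params.items():
        c_med = statistics.median(costs)
        q1, _, q3 = statistics.quantiles(costs, n=4)  # quartiles of the cost
        c_iqr = q3 - q1
        norm = math.hypot(c_med, c_iqr)
        if norm < best_norm:
            best, best_norm = params, norm
    return best, best_norm
```

Minimising this norm favours parameter combinations whose cost is both low (small median) and consistent across trials (small inter-quartile range).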

Table 3. The proportion of the 50 validation trials with cost less than the acceptable cost of 0.2 for different detector types, signal models, and SNRs. The cells with proportions greater than or equal to 0.8 are highlighted in gray.

In Table 1, while the RMS and Lidierth detectors do use a filter for the conditioning, those filters' cut-off frequencies are not listed as parameters. Did the authors intentionally choose fixed, non-optimised frequencies for those (and if so, why)? Or is this a simple oversight in the table?
Fig. 1: Correlation plot of false negative rate vs. latency of the Hodges detector tested on the three signal models (from left: Gaussian, Laplacian and Biophysical) with 0 dB SNR.
Fig. 2: Correlation plot of false negative rate vs. latency of the Hodges detector tested on the three signal models (from left: Gaussian, Laplacian and Biophysical) with -3 dB SNR.
The figures can be found here

More generally, as the authors show that most detectors already behave quite poorly for such low SNR sEMG, the chosen cost bounds, and especially the 20%, seem quite high for actual use. Given the application in mind, where the objective is to detect one true onset as quickly as possible (and 50 ms appears reasonable for robotic assistance), I wonder if a cost accounting for a stricter false negative rate and more tolerant of (if accounting at all for) false positives would make more sense. Indeed, only the earliest positive detection after onset can be considered useful for an application using this signal as a trigger. Wouldn't it make sense to also train and evaluate the detectors on a cost based solely on a very strict false positive rate (5%?) and not include FN in the cost (given that only the earliest detection after true onset matters)?

Regarding the FP and FN rates, those are also calculated per trial, meaning that 20% (or even 5%) can mean that each trial has at least several FPs before onset, and not that 20% (or 5%) of the trials have an FP. It is this latter figure that is of direct interest for the application. The authors should clearly discuss how these results translate to real scenarios and provide results showing in what percentage of test trials an FP detection happens before onset.